Okay, then let's start, I guess. I see numbers are continuing to dwindle. We talked about
the value of information. The main thing you should remember here is basically this
formula: we determine the value of information with respect to some particular random variable
F as the expected utility of the action that maximizes expected utility given that we know
F has a particular value f, minus the expected utility we get if we don't know the value of
that random variable at all. In other words, it is simply what we expect to gain in terms of
utility from knowing that the random variable has that particular value. I know that's a
mouthful, but the general structure of the equation should be somewhat clear.
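Written out, the formula looks roughly like this (a reconstruction in common notation; the
exact symbols on the slides may differ):

    \mathrm{VPI}(F) \;=\; \sum_{f} P(F = f)\,\mathrm{EU}(\alpha_f \mid F = f) \;-\; \mathrm{EU}(\alpha),
    \quad\text{where } \alpha_f = \operatorname*{argmax}_a \mathrm{EU}(a \mid F = f)
    \ \text{ and } \ \alpha = \operatorname*{argmax}_a \mathrm{EU}(a).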
Then we talked about a couple of properties it has, in particular that VPI is not additive,
i.e., the value of knowing two random variables is in general not just the sum of the values
of the two distinct random variables. We also know it is always at least zero, so non-negative.
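In symbols (same caveat about notation):

    \mathrm{VPI}(F, G) \;\neq\; \mathrm{VPI}(F) + \mathrm{VPI}(G) \ \text{in general (non-additivity)},
    \qquad \mathrm{VPI}(F) \;\geq\; 0 \ \text{(non-negativity)}.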
Apart from that, we talked about some examples to gain a bit of intuition for when the value
of information is actually large, for some notion of large. First, if I already know that some
particular action is likely to yield the better utility, then information about the random
variables involved is worth rather little: we already know the action with utility U1 is the
better choice anyway. Second, if the utilities are very close together and have very narrow
ranges in the first place, then the amount of utility we expect to gain from actually knowing
the values of those random variables is also rather little. So the cases in which we actually
expect to gain utility from gathering information usually look like the case in the middle of
the slide: actions whose utilities overlap to some extent and have a rather broad range in the
first place, so that, say, the utility U2 of the second action can be pinned down further than
just this rather broad distribution.
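To make that concrete, here is a small toy computation (all numbers made up for illustration):
a safe action whose utility is fixed, and a risky action whose utility depends on a Boolean
variable W. Because the utility ranges overlap, knowing W has positive value:

    P_W = 0.5  # prior probability that W is true (illustrative)
    U = {("a1", True): 10.0, ("a1", False): 10.0,   # a1 is safe
         ("a2", True): 15.0, ("a2", False):  0.0}   # a2 is risky

    def eu(action):
        # Expected utility of an action when W is unknown.
        return P_W * U[(action, True)] + (1 - P_W) * U[(action, False)]

    eu_without = max(eu("a1"), eu("a2"))   # = 10.0: without information we pick a1
    eu_with = (P_W * max(U[("a1", True)], U[("a2", True)])            # best action if W is true
               + (1 - P_W) * max(U[("a1", False)], U[("a2", False)])) # best action if W is false
    print(eu_with - eu_without)            # VPI = 12.5 - 10.0 = 2.5

On average we gain 2.5 utility by observing W before choosing, which is exactly the
"overlapping, broad distributions" situation from the slide.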
We've talked about a very simple implementation of deciding when to gather information in the
first place: we basically just check all the random variables we can gain information about,
compare their value of perfect information with the cost of obtaining it, and if we find one,
the maximal one, whose value of perfect information is actually bigger than the cost of
gathering that information, then we observe it; otherwise we just pick the best action we would
have taken without gathering information. We call that particular algorithm myopic in the sense
that it only ever considers observing the single random variable that is expected to be most
worthwhile, and if there is no such variable, it just acts immediately. What this algorithm
does not do is consider potentially gathering more information about other random variables
subsequently, and so on and so forth. So it's not an ideal algorithm, but it's a good starting
point; a sketch follows below. There are strategies that are non-myopic, but we're not going
to go into detail on those here.
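A minimal sketch of such a myopic agent, assuming hypothetical callables vpi, cost, observe,
and best_action (these names are illustrative, not from the lecture):

    def myopic_step(variables, vpi, cost, observe, best_action):
        """One step of the myopic information-gathering agent: observe the
        single most worthwhile variable if its VPI exceeds its cost,
        otherwise act immediately."""
        worthwhile = [v for v in variables if vpi(v) > cost(v)]
        if worthwhile:
            best = max(worthwhile, key=lambda v: vpi(v) - cost(v))
            return observe(best)   # gather information first
        return best_action()       # act without gathering information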
Apart from that, we started talking about stochastic processes, where the assumption is that
we have a sequence of random variables indexed by some time structure, where by time structure
we in practice basically always just mean the natural numbers anyway, and these random
variables all have the same domain. So the usual way you should think about this is that we
have one random variable at each particular time step. We have a couple of notations, in
particular this one: whenever we write some random variable X with a lower index a colon b,
for, in our case, natural numbers a and b, we mean the sequence of random variables X_a,
X_{a+1}, X_{a+2}, and so on and so forth, up until X_b. And just to keep the slides from
overflowing horizontally, when we want to assign values to such a sequence, we write the
assignment as an upper index, equals e, to signify that for these particular random variables
we assume we actually know their values.
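In symbols, that notation is roughly the following (a reconstruction; the exact slide
typography may differ):

    X_{a:b} \;:=\; X_a, X_{a+1}, X_{a+2}, \dots, X_b, \qquad
    X_{a:b}^{=e} \;\text{ meaning } X_a, \dots, X_b \text{ are known to take the values } e.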
Does that make sense so far? We're going to look at a couple of examples anyway. The running
example that we're going to use is this one: we assume we are some kind of security guard in
an underground facility, where we don't actually observe the weather at all. The only possible
evidence we have for what the weather is currently doing is whether the director of that
particular facility comes in with an umbrella or doesn't. That gives us two stochastic
processes: one for whether it rains or not, and one for whether the director brings an umbrella.
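As a sketch of that setup, the two Boolean processes could be simulated like this (the
probabilities are made up for illustration; they are not values from the lecture):

    import random

    # Hypothetical, purely illustrative parameters:
    P_RAIN_GIVEN_RAIN = 0.7  # P(Rain_t | Rain_{t-1})
    P_RAIN_GIVEN_DRY  = 0.3  # P(Rain_t | not Rain_{t-1})
    P_UMB_GIVEN_RAIN  = 0.9  # P(Umbrella_t | Rain_t)
    P_UMB_GIVEN_DRY   = 0.2  # P(Umbrella_t | not Rain_t)

    def simulate(steps, rain=False):
        """Yield (rain_t, umbrella_t) pairs; the guard only ever sees umbrella_t."""
        for _ in range(steps):
            rain = random.random() < (P_RAIN_GIVEN_RAIN if rain else P_RAIN_GIVEN_DRY)
            umbrella = random.random() < (P_UMB_GIVEN_RAIN if rain else P_UMB_GIVEN_DRY)
            yield rain, umbrella

    for t, (r, u) in enumerate(simulate(5)):
        print(f"t={t}: rain={r}, umbrella={u}")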